Wind speed is the main determinant of wind power, so wind speed forecasting is of
great importance. It makes it possible to plan the dispatch, to determine the hours
of storage needed and the amount of stored energy that should be used, and to avoid
the large fluctuations in the electrical grid caused by the nature of renewable
energy resources. In this paper, we propose four hybrid models based on Support
Vector Machines (SVM) and Artificial Neural Networks (ANNs, or simply Neural
Networks, NN) for wind speed forecasting. Ordinary Least Squares (OLS) analysis is
used to select the parameters that most influence wind speed. Then, the Support
Vector Machine and Artificial Neural Network models are tuned by a Genetic
Algorithm (GA) and by Particle Swarm Optimization (PSO). The performance of these
models is evaluated using three statistical indicators: the Mean Square Error (MSE),
the Mean Error (ME) and the Mean Absolute Error (MAE). The results show a better
performance of the neural model compared to the support vector machine.

1. INTRODUCTION

Wind power is nowadays one of the predominant alternative energy sources. The energy produced by wind
is clean, and it helps to reduce global warming and environmental pollution because it does not release
dangerous emissions (e.g., those produced by fossil fuel power stations, which cause several human health
issues). In this sense, the accurate forecasting of wind speed is important for the safe use of renewable
energy. At present, Morocco is the largest producer of wind power in Africa, since it benefits from great
solar and wind energy potential [1] because of its key geographic location [2].
In recent years, many methods have been developed to forecast wind speed, and both classical and hybrid
models [3] have been the subject of many works. The authors of [4] presented a hybrid model combining
Ensemble Empirical Mode Decomposition (EEMD) with a Genetic Algorithm-Backpropagation (GA-BP) neural
network and compared its performance with that of Empirical Mode Decomposition (EMD) combined with GA-BP,
the traditional GA-BP network and the Wavelet Neural Network (WNN) method. The best model found was the
hybrid EEMD and GA-BP neural network (6.82% for MAPE and 0.59 for RMSE). The authors of [5] designed a new
hybrid model, KF-ANN, based on an artificial neural network (ANN) and a Kalman filter (KF), and compared it
with Autoregressive Integrated Moving Average (ARIMA), ANN and hybrid ARIMA-KF models. The results showed
that all these models were effective, but the MAPE values indicated that the hybrid KF-ANN model was the
most effective. In [6], an ANN was compared with persistence and ARIMA models; the best model was a
backpropagation network with one hidden layer and nine neurons (0.18 m/s for MAE and 0.05 m²/s² for MSE).
In [7], an evolutionary-SVM algorithm was applied to short-term wind speed prediction for the wind turbines
of a Spanish wind farm. In [8], a combined PSO-BP neural network method was presented, and the results
indicate that the proposed method predicts wind speed more effectively than the basic backpropagation
neural network and the ARIMA model.

Methods based on artificial neural networks [9-10] have proved to be well suited to nonlinear problems
owing to their capacity for generalization and for fitting a wide range of functions, which makes them
suitable for processes driven by random variables, in particular the forecasting of renewable energy
resources. Because of their nonlinear nature, artificial neural networks are used as part of many hybrid
forecasting models. We choose the SVM because it has been shown to be efficient in addressing nonlinear
input-output mapping [11]. In addition, the simplicity and flexibility of GA and PSO justify their use to
optimize the NN connection weights and the SVM parameters during the training phase. In this study, we
identify, based on the Ordinary Least Squares (OLS) analysis, four inputs influencing the wind speed,
namely direction, humidity, temperature and past wind speed. The data used are hourly meteorological
satellite data from 2011 to 2013 (3 years) provided by the Research Institute of Solar Energy and New
Energy (IRESEN). The region studied is Tangier, located in the south Mediterranean region in north Morocco
(latitude 35.800, longitude -5.840 and altitude 8). Afterwards, we apply the proposed hybrid models SVM-GA,
SVM-PSO, NN-GA and NN-PSO to capture the relationship between the aforementioned inputs and the wind speed
and to verify the performance of these models for the prediction of hourly wind speed in Tangier. The
remainder of this article is organized as follows: Section 2 discusses in detail the NN, SVM, GA and PSO.
Section 3 presents the analysis of the experimental results, and the last section concludes this article.


2. METHODOLOGY 

In this section, the methodologies and algorithms used for prediction are presented.
The first subsection is dedicated to the technique used for analyzing the inputs. The remaining subsections
cover Support Vector Machines and Artificial Neural Networks and define the optimization algorithms used,
namely the Genetic Algorithm and Particle Swarm Optimization.


2.1. The ordinary least squares (OLS) 

Ordinary least squares (OLS) is a statistical method that fits a model using all of the
independent variables. It estimates the linear regression coefficients by minimizing the sum of the squared
vertical distances between the regression line and the data points [12-13]. In our models, no constant term
is used, because it has no meaningful physical interpretation and would affect the results.
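
As an illustration of a regression fitted without a constant term, the following minimal sketch solves the normal equations for a small toy problem. The function name `ols_no_intercept` and the toy data are hypothetical and are not taken from the study; the sketch only assumes that the predictors are stored row by row.

```python
def ols_no_intercept(X, y):
    """Least-squares coefficients for y ~ X * beta with no constant term.

    Solves the normal equations (X^T X) beta = X^T y by Gauss-Jordan
    elimination with partial pivoting.
    """
    k = len(X[0])
    # Augmented matrix [X^T X | X^T y].
    A = [[sum(row[j] * row[c] for row in X) for c in range(k)]
         + [sum(row[j] * yi for row, yi in zip(X, y))]
         for j in range(k)]
    for col in range(k):
        # Partial pivoting: bring the largest remaining entry to the diagonal.
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        # Eliminate this column from every other row.
        for r in range(k):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return [A[j][k] / A[j][j] for j in range(k)]


# Hypothetical toy data: two predictors, target built as 2*x1 + 0.5*x2.
X = [[1.0, 2.0], [2.0, 1.0], [3.0, 4.0], [4.0, 3.0]]
y = [2.0 * a + 0.5 * b for a, b in X]
beta = ols_no_intercept(X, y)
print(beta)
```

Because the toy target is an exact linear combination of the predictors, the fitted coefficients recover the generating weights; with real meteorological data a residual would remain.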


2.2. Support vector machines 

Support Vector Machines (SVMs) [14-19] are supervised learning models commonly used for
classification problems. Given a set of data, the SVM attempts to separate the two classes using a
hyperplane, which is a subspace of one dimension less than the ambient space. Various mathematical
techniques, such as dual quadratic programming and nonlinear kernel transformations, can be used to
maximize the margin between the classes of data. A hard-margin SVM enforces the principle that the data
must be separable. However, it may not be possible to achieve perfect separation of the data in some cases.
Therefore, a soft-margin SVM classifier can be used to maximize the hyperplane margin while allowing
some misclassifications. In our approach, we adopt ε-Support Vector Regression (ε-SVR). The basic principle
of SVM for regression is to map the data into a high-dimensional feature space via a nonlinear mapping,
after which a linear regression is performed in this feature space. The regression formula can be
expressed as:


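$$ f(x) = \sum_{i=1}^{n} w_i \Phi_i(x) + b \tag{1} $$

where $\{\Phi_i(x)\}_{i=1}^{n}$ are named features and $b$ is the bias term. The coefficients
$\{w_i\}_{i=1}^{n}$ can be obtained from the data by optimizing the following quadratic programming problem:

$$
\begin{aligned}
\text{Minimize} \quad & \frac{1}{2}\|w\|^2 + C \sum_{i=1}^{n} (\xi_i + \xi_i^*) \\
\text{Subject to} \quad & y_i - \langle w, \Phi(x_i) \rangle - b \le \varepsilon + \xi_i \\
& \langle w, \Phi(x_i) \rangle - y_i + b \le \varepsilon + \xi_i^* \\
& \xi_i, \xi_i^* \ge 0, \quad i = 1, \dots, n
\end{aligned}
\tag{2}
$$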


where $\xi_i$ and $\xi_i^*$ are slack variables and $C$ is a positive constant determining the trade-off
between the flatness of $f$ and the amount up to which deviations larger than $\varepsilon$ are tolerated.
Solving the optimization problem through the introduction of Lagrange multipliers and the optimality
conditions gives:

$$ f(x, \alpha, \alpha^*) = \sum_{i=1}^{n} (\alpha_i - \alpha_i^*)\, k(x_i, x) + b \tag{3} $$

where $\alpha_i$ and $\alpha_i^*$ are called Lagrange multipliers. They are obtained by maximizing the dual
function of (2) as follows:

$$
\begin{aligned}
\text{Maximize} \quad & \sum_{i=1}^{n} y_i (\alpha_i - \alpha_i^*) - \varepsilon \sum_{i=1}^{n} (\alpha_i + \alpha_i^*)
 - \frac{1}{2} \sum_{i=1}^{n} \sum_{j=1}^{n} (\alpha_i - \alpha_i^*)(\alpha_j - \alpha_j^*)\, k(x_i, x_j) \\
\text{Subject to} \quad & \sum_{i=1}^{n} (\alpha_i - \alpha_i^*) = 0 \\
& 0 \le \alpha_i, \alpha_i^* \le C
\end{aligned}
\tag{4}
$$


Here $k(x_i, x_j)$ denotes the kernel function, equal to the inner product of the two vectors $x_i$ and
$x_j$ in the feature space, i.e. $k(x_i, x_j) = \Phi(x_i) \cdot \Phi(x_j)$. Any function that satisfies
Mercer's condition [20] can be used as the kernel function, and the choice depends on the problem at hand.
In our approach, we opted for the Radial Basis Function (RBF) as the kernel function of the SVR model
because of its good analyticity, the existence of derivatives of arbitrary order and its good performance
under general smoothness assumptions. Further details concerning the ε-SVR are presented in [21]. The
expression of the radial basis function is:

$$ k(x_i, x_j) = \exp\left(-\gamma \|x_i - x_j\|^2\right) \tag{5} $$

where $\gamma$ is the kernel parameter (gamma).

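
To make the prediction function (3) and the RBF kernel concrete, here is a minimal sketch that evaluates them for fixed multipliers. The gamma parametrization of the kernel, the function names and all numerical values are assumptions chosen for illustration; in practice the multipliers come from solving the dual problem (4), which is not done here.

```python
import math


def rbf_kernel(x, z, gamma):
    """RBF kernel exp(-gamma * ||x - z||^2) (gamma parametrization assumed)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-gamma * sq_dist)


def svr_predict(x, support_vectors, alpha, alpha_star, b, gamma):
    """Evaluate f(x) = sum_i (alpha_i - alpha_i^*) k(x_i, x) + b, as in (3)."""
    return sum((a - a_s) * rbf_kernel(sv, x, gamma)
               for sv, a, a_s in zip(support_vectors, alpha, alpha_star)) + b


def eps_insensitive_loss(y_true, y_pred, eps):
    """Deviations smaller than eps cost nothing; larger ones cost the excess."""
    return max(0.0, abs(y_true - y_pred) - eps)


# Hypothetical multipliers; they satisfy sum_i (alpha_i - alpha_i^*) = 0.
support_vectors = [[0.0, 0.0], [1.0, 1.0]]
alpha = [0.5, 0.0]
alpha_star = [0.0, 0.5]
prediction = svr_predict([0.0, 0.0], support_vectors, alpha, alpha_star,
                         b=0.1, gamma=1.0)
print(prediction)
```

The ε-insensitive loss is included because it is what the slack variables in (2) measure: a point inside the ε-tube contributes no penalty.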
2.3. Neural network 

A neural network is a modeling and prediction tool [22]. The network is formed by connecting the
outputs of certain neurons to the inputs of other neurons through synaptic weights, which are used to store
the knowledge. Any layer located between the input layer and the output layer is called a hidden layer. The
output of the neural network is given by the following equation:

$$ Y_j = f\left(\sum_{i} X_i W_{ij} + b_j\right) \tag{6} $$

where $X_i$ are the inputs, $W_{ij}$ the weights, $b_j$ the biases and $f$ the activation function. It should
be noted that the number of neurons in the input/output layer is determined by the variables that are dealt
with.
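
The weighted-sum structure of (6) can be sketched as a forward pass through one hidden layer. The tanh hidden activation, the linear output neuron, the function name `forward` and all weight values below are assumptions made for illustration; the sizes (four inputs, three hidden neurons) simply mirror the inputs and hidden-layer size mentioned in this paper.

```python
import math


def forward(x, W_hidden, b_hidden, W_out, b_out):
    """Forward pass: each neuron applies an activation to sum_i X_i * W_ij + b_j."""
    # Hidden layer: tanh activation (an assumption, not stated in the paper).
    hidden = [math.tanh(sum(xi * w for xi, w in zip(x, weights)) + b)
              for weights, b in zip(W_hidden, b_hidden)]
    # Output neuron: linear activation, suitable for a regression target.
    return sum(h * w for h, w in zip(hidden, W_out)) + b_out


# Hypothetical weights for 4 inputs and 3 hidden neurons.
x = [0.5, -0.2, 0.1, 0.3]
W_hidden = [[0.1, 0.2, -0.1, 0.4],
            [-0.3, 0.1, 0.2, 0.0],
            [0.2, -0.2, 0.3, 0.1]]
b_hidden = [0.0, 0.1, -0.1]
W_out = [0.5, -0.4, 0.3]
b_out = 0.05
print(forward(x, W_hidden, b_hidden, W_out, b_out))
```

In the hybrid NN-GA and NN-PSO models, the entries of the weight and bias lists are the quantities the optimizer would search over.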

The input/target datasets were divided randomly into two subsets. The first set is to train the neural 
network using the learning algorithm in order to find the optimal weights that minimize the Mean Square 
Error (MSE). The second one is to validate the already trained model [23]. 


2.4. Genetic algorithm 

A genetic algorithm is a stochastic general search method [24-27]. It was proposed by John Holland
et al. and has been widely applied in various forecasting and optimization fields. The genetic algorithm
repeatedly modifies a population of individual solutions: at each step, it selects individuals at random
from the current population and uses them as parents to produce the children for the next generation. Over
successive generations, the population "evolves" toward an optimal solution.
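
The generational loop described above can be sketched as follows. This is one common real-coded variant, not necessarily the configuration used in this study: tournament selection, arithmetic crossover, Gaussian mutation and elitism are all assumptions, as are the function name `genetic_minimize` and the toy objective.

```python
import random


def genetic_minimize(fitness, bounds, pop_size=30, generations=50,
                     mutation_rate=0.1, seed=0):
    """Minimize `fitness` over box `bounds` with a simple real-coded GA."""
    rng = random.Random(seed)

    def tournament(pop):
        # Pick two individuals at random and keep the fitter one.
        a, b = rng.sample(pop, 2)
        return a if fitness(a) < fitness(b) else b

    population = [[rng.uniform(lo, hi) for lo, hi in bounds]
                  for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        best = min(population, key=fitness)
        history.append(fitness(best))
        children = [best]  # elitism: the best individual always survives
        while len(children) < pop_size:
            p1, p2 = tournament(population), tournament(population)
            w = rng.random()
            # Arithmetic crossover: a convex combination of the two parents.
            child = [w * g1 + (1 - w) * g2 for g1, g2 in zip(p1, p2)]
            # Gaussian mutation, clipped back into the search box.
            child = [min(hi, max(lo, g + rng.gauss(0, 0.1 * (hi - lo))))
                     if rng.random() < mutation_rate else g
                     for g, (lo, hi) in zip(child, bounds)]
            children.append(child)
        population = children
    best = min(population, key=fitness)
    return best, history + [fitness(best)]


# Hypothetical objective standing in for a validation MSE.
def sphere(p):
    return (p[0] - 1.0) ** 2 + (p[1] + 2.0) ** 2


search_box = [(-5.0, 5.0), (-5.0, 5.0)]
best, history = genetic_minimize(sphere, search_box)
print(best, history[-1])
```

Because of elitism, the best fitness recorded in `history` never gets worse from one generation to the next, which is a convenient sanity check when the fitness is a model's MSE.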


2.5. Particle swarm optimization 

PSO is an evolutionary computation method [28-32]. The system is initialized with a
population of random solutions and searches for optima by updating generations. In PSO, the particles fly
through the problem space by following the current optimum particles. The basic mathematical expressions
of PSO are as follows:

$$
\begin{aligned}
V_{sd}(t+1) &= V_{sd}(t) + c_1(t)\, r_1(t) \left(p_{sd}(t) - X_{sd}(t)\right) + c_2(t)\, r_2(t) \left(p_{gd}(t) - X_{sd}(t)\right) \\
X_{sd}(t+1) &= X_{sd}(t) + V_{sd}(t+1)
\end{aligned}
\tag{7}
$$

where $t$ is the iteration number; $r_1$ and $r_2$ are random variables uniformly distributed on the
interval (0, 1); $c_1(t)$ and $c_2(t)$ are acceleration constants; $X_{sd}(t)$ and $V_{sd}(t)$ are the
position and velocity of particle $s$ in dimension $d$ at iteration $t$; $p_{sd}(t)$ is the best position
found so far by particle $s$; and $p_{gd}(t)$ is the optimal location found by all particles in the swarm
up to iteration $t$.
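
The update rule (7) can be sketched directly in code. The velocity clamp and the clipping of positions to the search box are additions that do not appear in (7); they are assumed here only to keep the toy run bounded. The function name `pso_minimize`, the constant values c1 = c2 = 2 and the objective are likewise hypothetical.

```python
import random


def pso_minimize(objective, bounds, n_particles=20, iterations=100,
                 c1=2.0, c2=2.0, seed=0):
    """Minimize `objective` with the basic PSO velocity/position update."""
    rng = random.Random(seed)
    dim = len(bounds)
    v_max = [0.2 * (hi - lo) for lo, hi in bounds]  # clamp: an added assumption
    X = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    V = [[0.0] * dim for _ in range(n_particles)]
    p_best = [x[:] for x in X]                      # best position per particle
    p_best_val = [objective(x) for x in X]
    g = min(range(n_particles), key=lambda s: p_best_val[s])
    g_best, g_best_val = p_best[g][:], p_best_val[g]  # best position of the swarm
    history = [g_best_val]
    for _ in range(iterations):
        for s in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Velocity update: previous velocity plus pulls toward both bests.
                v = (V[s][d] + c1 * r1 * (p_best[s][d] - X[s][d])
                     + c2 * r2 * (g_best[d] - X[s][d]))
                V[s][d] = max(-v_max[d], min(v_max[d], v))
                lo, hi = bounds[d]
                # Position update, clipped to the search box.
                X[s][d] = max(lo, min(hi, X[s][d] + V[s][d]))
            value = objective(X[s])
            if value < p_best_val[s]:
                p_best[s], p_best_val[s] = X[s][:], value
                if value < g_best_val:
                    g_best, g_best_val = X[s][:], value
        history.append(g_best_val)
    return g_best, g_best_val, history


# Hypothetical objective standing in for a validation MSE.
def sphere(p):
    return (p[0] - 1.0) ** 2 + (p[1] + 2.0) ** 2


search_box = [(-5.0, 5.0), (-5.0, 5.0)]
g_best, g_best_val, history = pso_minimize(sphere, search_box)
print(g_best, g_best_val)
```

When PSO tunes the SVM, each particle position would hold one candidate triple of parameters, and the objective would be the MSE of the model trained with that triple.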


IJ-AI Vol. 8, No. 3, September 2019: 286 — 291 


IJ-AI ISSN: 2252-8938


3. RESULTS AND DISCUSSION 

In this study, we first tried to fit five inputs to wind speed, namely direction, past wind speed,
humidity, pressure and temperature. Selecting the predictors is a necessary phase, and some assumptions
must be verified in order to do so. The first is to check the correlation between the dependent variable
and the predictors, in order to determine the degree of dependence of the predicted values on the
predictors. A p-value of 0.05 or more means that the null hypothesis of a zero coefficient would not be
rejected at the 5% level [33]; the p-values are given in Table 1. Consequently, pressure should be
considered for removal from the model for Tangier.
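
The link between a coefficient, its standard error, the t-value and the p-value can be checked with a short calculation. The sketch below recomputes the t-values from the coefficients and standard errors listed in Table 1 and converts them to two-sided p-values using a normal approximation, which is an assumption that is reasonable only for large samples such as three years of hourly data. The names `TABLE_1`, `t_value` and `two_sided_p_value` are hypothetical.

```python
import math

# (coefficient, standard error) pairs copied from Table 1.
TABLE_1 = {
    "Direction": (-0.0000987504, 0.0000144857),
    "Humidity": (-0.00127188, 0.0000830857),
    "Temperature": (0.000407384, 0.0000214706),
    "Past wind speed": (0.990865, 0.00101643),
    "Pressure": (-0.000060839, 0.0000722721),
}


def t_value(coefficient, standard_error):
    """t statistic for the null hypothesis that the coefficient is zero."""
    return coefficient / standard_error


def two_sided_p_value(t):
    """Two-sided p-value under a standard normal approximation (assumption)."""
    return math.erfc(abs(t) / math.sqrt(2.0))


results = {}
for name, (coef, se) in TABLE_1.items():
    t = t_value(coef, se)
    p = two_sided_p_value(t)
    results[name] = (t, p)
    decision = "keep" if p < 0.05 else "consider removing"
    print(f"{name:16s} t = {t:10.4f}   p = {p:.4f}   {decision}")
```

Under this approximation, the pressure coefficient, whose |t| is about 0.84, is the only one whose p-value exceeds the 5% threshold, which matches the decision to drop pressure.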

The second is to remove the multicollinearity between the independent variables. Finally, the
four explanatory variables that most influence wind speed, namely direction, humidity, temperature and
past wind speed, are retained to develop the proposed hybrid models [34]. The SVM model is used to perform
wind speed forecasting. The parameters cost (C), gamma (g) and epsilon (e) have a significant impact on the
forecasting performance; their values are listed in Tables 2 and 3. The Genetic Algorithm (GA) and Particle
Swarm Optimization (PSO) are used to optimize these three parameters by minimizing the MSE value.
To evaluate the proposed hybrid approach and to determine quantitatively the best model, the coefficient of
determination (R²) and three statistical indices are used to measure the forecasting accuracy. These indices
are the Mean Error (ME), Mean Square Error (MSE) and Mean Absolute Error (MAE) between the predicted
and the actual values of wind speed. They are defined in (8)-(11).


Table 1. Statistical analysis obtained from the OLS at Tangier station

Parameter         Coefficient value   Standard error   t-value     P-value
Direction         -0.0000987504       0.0000144857     -6.8171     0.0000
Humidity          -0.00127188         0.0000830857     -15.308     0.0000
Temperature       0.000407384         0.0000214706     18.9741     0.0000
Past wind speed   0.990865            0.00101643       974.848     0.0000
Pressure          -0.000060839        0.0000722721     -0.84181    0.0039

Table 2. Optimal parameters of SVM-GA

City      Cost (C)   Gamma (g)   Epsilon (e)
Tangier   15.2749    3.144599    0.0006444887

Table 3. Optimal parameters of SVM-PSO

City      Cost (C)   Gamma (g)   Epsilon (e)
Tangier   10.8001    0.167599    0.0003052174

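$$ ME = \frac{1}{N} \sum_{i=1}^{N} (T_i - Y_i) \tag{8} $$

$$ MSE = \frac{1}{N} \sum_{i=1}^{N} (T_i - Y_i)^2 \tag{9} $$

$$ MAE = \frac{1}{N} \sum_{i=1}^{N} |T_i - Y_i| \tag{10} $$

$$ R^2 = 1 - \frac{\sum_{i=1}^{N} (T_i - Y_i)^2}{\sum_{i=1}^{N} (T_i - \bar{T})^2} \tag{11} $$

where $T_i$ are the targets, $\bar{T}$ is the mean of all target values, $Y_i$ are the model outputs and
$N$ is the number of samples.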
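
The four indicators (8)-(11) translate directly into code. The following sketch uses hypothetical function names and toy values rather than data from the study.

```python
def mean_error(targets, outputs):
    """ME, equation (8): signed errors can cancel each other out."""
    return sum(t - y for t, y in zip(targets, outputs)) / len(targets)


def mean_square_error(targets, outputs):
    """MSE, equation (9)."""
    return sum((t - y) ** 2 for t, y in zip(targets, outputs)) / len(targets)


def mean_absolute_error(targets, outputs):
    """MAE, equation (10)."""
    return sum(abs(t - y) for t, y in zip(targets, outputs)) / len(targets)


def r_squared(targets, outputs):
    """Coefficient of determination, equation (11)."""
    mean_t = sum(targets) / len(targets)
    ss_res = sum((t - y) ** 2 for t, y in zip(targets, outputs))
    ss_tot = sum((t - mean_t) ** 2 for t in targets)
    return 1.0 - ss_res / ss_tot


# Hypothetical toy values (m/s).
targets = [1.0, 2.0, 3.0, 4.0]
outputs = [1.1, 1.9, 3.2, 3.8]
print(mean_error(targets, outputs), mean_square_error(targets, outputs),
      mean_absolute_error(targets, outputs), r_squared(targets, outputs))
```

In this toy case the positive and negative errors cancel, so the ME is zero up to rounding while the MAE is not, which is why the ME alone is not a sufficient measure of accuracy.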

Table 4 shows the errors of the four hybrid models. The coefficient of determination (R-squared),
also called goodness of fit, shows a very good quality of prediction for the NN-GA and NN-PSO models,
since the values are very close to 1. Additionally, the MSE, ME and MAE show relatively low errors for
these models, which can be explained by the fact that the models describe most of the data. The NN-GA,
with 3 neurons in the hidden layer, is the best suited for the prediction (see its MSE in Table 4). On the
other hand, the R-squared of the SVM-GA and SVM-PSO models is close to 0. The MSE, ME and MAE still show
relatively low errors for these models, which can be explained by the fact that the models describe most of
the data but not with high accuracy. Furthermore, Figure 1 and Figure 2 show a large gap between the actual
and predicted values for SVM-PSO and SVM-GA, whereas in Figure 3 and Figure 4 the two curves almost
coincide for NN-PSO and NN-GA. According to these results, we conclude that the NN-GA is able to
predict the wind speed with higher accuracy than the other models.

Table 4. Performance evaluation of different models for forecasts in Tangier

Methods   MSE (m²/s²)     ME (m/s)         MAE (m/s)         R-squared
SVM-GA    2.42 × 10^?     6 × 10^?         0.1238            0.1947
SVM-PSO   2.60 × 10^?     1.31 × 10^?      0.1281            0.1355
NN-PSO    5.27 × 10^-4    4.0765 × 10^?    0.0124            0.9881
NN-GA     4.71 × 10^?     0.0169           9.0892 × 10^-4    0.9834

Figure 1. The actual/forecast wind speed by SVM-PSO
Figure 2. The actual/forecast wind speed by SVM-GA
Figure 3. The actual/forecast wind speed by NN-PSO
Figure 4. The actual/forecast wind speed by NN-GA


4. CONCLUSION 

The objective of this study was the prediction of wind speed using four hybrid models. First,
based on the OLS analysis, we identified four inputs influencing the wind speed, namely direction, humidity,
temperature and past wind speed. Afterwards, we applied the proposed hybrid models NN-GA, SVM-GA,
SVM-PSO and NN-PSO to capture the relationship between the aforementioned inputs and the wind speed.
The performance of these models was evaluated using R-squared and three statistical indicators: the Mean
Square Error (MSE), Mean Error (ME) and Mean Absolute Error (MAE). We conclude that the
SVM-GA and SVM-PSO models are not suitable for estimating the behavior of wind speed in the region of
Tangier using the selected inputs. On the other hand, the nonlinear neural network method tuned by the
genetic algorithm, using the same inputs, gives excellent results, which indicates its high accuracy in
the prediction of wind speed.